Estimating Sampling Selection Bias in Human Genetics: A Phenomenological Approach
Authors
Abstract
This research is the first empirical attempt to calculate the various components of the hidden bias associated with the sampling strategies routinely used in human genetics, with special reference to surname-based strategies. We reconstructed the surname distributions of 26 Italian communities with different demographic features across the last six centuries (years 1447-2001). The degree of overlap between "reference founding core" distributions and the distributions obtained by sampling the present-day communities with probabilistic and selective methods was quantified under different conditions and models. When only one individual per surname was taken into account (low kinship model), the average discrepancy was 59.5%, with a peak of 84% for random sampling. When multiple individuals per surname were considered (high kinship model), the discrepancy decreased by 8-30% at the cost of a larger variance. Criteria aimed at maximizing locally spread patrilineages and long-term residency appeared to be affected by recent gene flows much more than expected. Selecting the more frequent family names under low kinship criteria proved to be a suitable approach only for historically stable communities. In all other cases, true random sampling, despite its high variance, did not return more biased estimates than the selective methods. Our results indicate that sampling individuals who bear historically documented surnames (founders' method) should be applied, especially when studying the male-specific genome, to prevent an over-stratification of ancient and recent genetic components that heavily biases inferences and statistics.
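The comparison between a founding-core surname set and a present-day sample can be illustrated with a small sketch. The abstract does not state which overlap measure was used, so the measure below (the share of sampled individuals whose surname is absent from the founding core) is an assumption made for illustration only. The function names, the surnames, and the toy population are all hypothetical and are not data from the study.

```python
import random

def low_kinship_sample(population, rng):
    """Pick one randomly chosen individual per distinct surname.

    `population` is a list of (person_id, surname) pairs.
    """
    by_surname = {}
    for person_id, surname in population:
        by_surname.setdefault(surname, []).append(person_id)
    # Sort surnames so the result order does not depend on input order.
    return [(rng.choice(ids), surname) for surname, ids in sorted(by_surname.items())]

def discrepancy(sample, founding_core):
    """Assumed measure: fraction of sampled individuals whose surname
    does not appear in the founding-core surname set."""
    if not sample:
        raise ValueError("sample must not be empty")
    outside = sum(1 for _, surname in sample if surname not in founding_core)
    return outside / len(sample)

# Hypothetical toy community: two founding surnames, two recent arrivals.
founding_core = {"Rossi", "Bianchi"}
population = [
    (1, "Rossi"), (2, "Rossi"), (3, "Rossi"),
    (4, "Bianchi"), (5, "Bianchi"),
    (6, "Esposito"),
    (7, "Russo"),
]

rng = random.Random(0)
low = discrepancy(low_kinship_sample(population, rng), founding_core)
high = discrepancy(population, founding_core)  # every individual kept
print(f"low kinship: {low:.3f}, high kinship: {high:.3f}")
```

Keeping one individual per surname gives every surname equal weight, whereas keeping all individuals weights surnames by their current frequency. The toy numbers only show how the two schemes differ mechanically and say nothing about the magnitudes reported in the study.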
Related resources
Maximum likelihood estimation of population growth rates based on the coalescent.
We describe a method for co-estimating 4Nemu (four times the product of effective population size and neutral mutation rate) and population growth rate from sequence samples using Metropolis-Hastings sampling. Population growth (or decline) is assumed to be exponential. The estimates of growth rate are biased upwards, especially when 4Nemu is low; there is also a slight upwards bias in the esti...
Bias of standard analysis methods in estimating causal effects
Standard methods for estimating exposure effects in longitudinal studies will result in biased estimates of the exposure effect in the presence of time-dependent confounders affected by past exposure. In the present review article, we first described the assumptions required for estimating the causal effect in longitudinal studies and their structure regarding various types of exposure and ...
Codon bias patterns in photosynthetic genes of halophytic grass Aeluropus littoralis
Codon bias refers to differences in the frequency of occurrence of synonymous codons in coding DNA. The pattern of codon usage and the choice of optimal codons differ significantly among organisms. This difference is due to the long-term action of natural selection and the evolutionary process. Genetic drift, mutation, and the regulation of gene expression are the main reasons for codon bias. In this stu...
Estimating and Adjusting for Publication Bias Using Data Augmentation in Bayesian Meta-Analysis
We introduce a Bayesian approach which estimates and adjusts for selection bias in a set of studies used in a meta-analysis. We use a hierarchical model for study outcome, and propose an additional model component to account for publication bias, which is the possibility that studies of interest are not equally likely to be published and hence observed studies are not a random sample. Estimatio...
Tracing selection effects in three non-probability samples.
Snowball sampling and targeted sampling are widely applied techniques to recruit samples from hidden populations, such as problematic drug users. The disadvantage is that they yield non-probability samples which cannot be generalised to the population. Despite thorough preparatory mapping procedures, selection effects continue to occur. This paper proposes an interpretation frame that allows es...
Journal title:
Volume 10, issue
Pages -
Publication date 2015